Reproducible Research

MSL meeting

Tuomas Eerola

February 28, 2023

What is Reproducible Research?

“A study is reproducible if there is a specific set of computational functions/analyses that exactly reproduce all of the numbers and data visualizations in a published paper from raw data. Reproducibility does not require independent data collection and instead uses the methods and data collected by the original investigator.” (Marwick, 2016, p. 4)

Why Reproducible Research?

Important

Crisis of replication and transparency in empirical research

in Nature (Baker, 2016)

1 – Credibility Crisis

  • “Why most published research findings are false” (Ioannidis, 2005)
  • Psychology has the lion's share of biased/misleading results (Simmons, Nelson, & Simonsohn, 2011)
  • Science: Serious concern about robustness of results
    • 100 studies were replicated; only about 36% of the replications produced statistically significant results (OSC, 2015)

Why Reproducible Research?

“Many Labs 2” Project Results in November 2018

  • \(\approx 50\%\) of psychology findings replicated
  • Small effect sizes
  • Consistency across labs
  • Alternative explanations controlled for
  • Made transparent via OSF and Psyarxiv

Why Irreproducible Research?

  • We don’t know better
  • We have pressure to publish
  • There is no incentive to produce reproducible research
  • Selective reporting drives results (cherry picking, p-hacking, only reporting positive results)
  • It keeps us artificially in business
    • we are the only ones who have, can process, or claim ownership of the data/method/protocol
    • “it preserves the competitive edge”
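The effect of selective reporting can be made concrete with a little arithmetic. The sketch below assumes a researcher measures several independent outcomes where no true effect exists and reports whichever reaches p < .05; the chance of at least one "significant" result is then 1 − (1 − α)^k. The function name and the chosen numbers of tests are illustrative and do not come from the cited papers.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Assumed scenario: every test is run on data with no real effect.
    for k in (1, 5, 10, 20):
        print(f"{k:>2} tests -> {familywise_error_rate(k):.2f}")
    # Prints 0.05, 0.23, 0.40 and 0.64 respectively.
```

Under these assumptions, reporting only the best of 20 outcomes yields a nominally significant finding nearly two times out of three, even when nothing is there.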

Benefits of Reproducibility

Benefits of Reproducibility (1)

  • Comply with the demands for transparency

    • First in computer sciences and biosciences
    • Next in social sciences (Asendorpf et al., 2013)
      • Note: Open Access Data is required by all RCUK funding
    • Some journals promote transparency
    • Pre-registration of study beforehand

Benefits of Reproducibility (2)

  • To collaborate more easily and effectively

    • Spot mistakes

    • Encourage learning and exploration

  • To communicate research more clearly

  • To raise awareness of quality concerns

Reproducible Research

The article is just the tip of the iceberg. Reproducible research makes the whole workflow accessible.

Types of Reproducible Research

3 kinds of reproducibility (Stodden, Leisch, & Peng, 2014):

  • Computational reproducibility: code, software, hardware and implementation details.

  • Empirical reproducibility: detailed information about non-computational empirical scientific experiments and observations. Basically the data.

  • Statistical reproducibility: detailed information is provided about the choice of statistical tests.
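For computational reproducibility, one small and common step is to record the software environment alongside the results. The following is a minimal sketch using only the Python standard library; the function name `describe_environment` and the package names passed to it are assumptions for illustration, not part of the talk.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return interpreter, OS and package versions as a plain dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; replace with whatever the analysis imports.
    info = describe_environment(["numpy", "librosa", "music21"])
    print(json.dumps(info, indent=2))
```

Saving this output next to the figures and tables gives readers a starting point for rebuilding the same setup.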

Reproducible Music Research?

Music Research is interdisciplinary, and some disciplines are closer to demands of reproducibility than others

  • Music Information Retrieval (MIR) leads the way (soundsoftware.ac.uk)

  • Music Psychology follows the path (interest in replication, but very slow take-up; a study about this is coming up)

  • Music Intervention research (has to register study protocols)

  • Computational music analysis/theory/ethnomusicology utilise corpus studies

How to Achieve Reproducible Research?

Workflows that are reproducible and transparent

    1. Design (sometimes requiring pre-registration)
    2. Analysis (using tools that allow reproduction)
    3. Data (data, analysis pipelines, repositories, etc.)
    4. Reporting (linking data, analysis, outputs)

1. Designs – Pre-registration

  • The study details (research questions, methods, recruitment, stimuli, analysis details, inferences) are defined and submitted to peer review (Registered Report)
    • If passed, the study receives In-Principle Acceptance (IPA)
    • Collect the data, follow the protocol, 2nd peer review
  • Coming to our field (we have pioneered several of these)

Pros and Cons of Pre-registration

Positives

  • Improves quality (definitions, design, measures, analysis)
  • Positive review experience
    • less at stake without data
    • collaborative mode

Negatives

  • Takes time (additional planning + review)
  • Reveals study plans to others (but safely so)
  • Innovation is considered more valuable than a boost in reliability

2. Analysis Tools and Reproducibility

  • Music analysis
    • Yes: Python (music21, librosa), R (incon, humdrumR)
    • No: Sonic Visualiser, sequencers
  • Statistics
    • Yes: R / Jamovi / Python
    • No: SPSS
  • Reporting
    • Yes: RMarkdown, Quarto, Jupyter notebooks
    • No: Microsoft Word
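What makes the scripted tools in the list above reproducible is that every step, including any randomness, is written down and can be re-run. A minimal sketch, assuming a percentile bootstrap of a mean rating; the data, the seed value and the function name are made up for illustration:

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, seed=2023, level=0.95):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_index], means[upper_index]


if __name__ == "__main__":
    ratings = [3.1, 4.0, 2.8, 3.6, 4.4, 3.9, 3.3, 4.1]  # made-up example data
    print(bootstrap_mean_ci(ratings))
    print(bootstrap_mean_ci(ratings))  # same seed, so the same interval
```

Because the seed is an explicit argument, anyone with the script and the data gets the same interval; in a point-and-click tool the equivalent settings are easy to lose.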

3. Data Sharing and Analysis Pipeline

  • Repositories (Github, OSF, Dataverse, Zenodo, etc.)

Note

Sharing can be done anonymously for review (e.g. https://anonymous.4open.science, drupal.org, OSF)
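When depositing data in a repository, a checksum manifest lets others verify that the files they download are the ones that were analysed. The sketch below is one possible way to build such a manifest with the Python standard library; the function names and the `data` directory are hypothetical, and none of the repositories listed above is assumed to require this.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map each file's path (relative to data_dir) to its SHA-256 digest."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


if __name__ == "__main__":
    # Hypothetical folder name; point this at the directory being shared.
    data_dir = Path("data")
    if data_dir.is_dir():
        for name, checksum in build_manifest(data_dir).items():
            print(f"{checksum}  {name}")
```

The printed lines can be saved as a text file and uploaded with the data, so a later download can be checked against it.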

4. Reporting – Reproducible Workflows

Ways to Promote Reproducibility

  • Require reproducibility from PhD students ☑️
  • Run replication studies in teaching ☑️
  • Make it one of your themes in lab meetings ☑️
  • Steer collaborations into reproducible workflows ☑️

“Reproducibility is like brushing your teeth. It is good for you, but it takes time and effort. Once you learn it, it becomes a habit.” (Baker, 2016)

The End

slides available at:

https://github.com/tuomaseerola/talks

References

Armitage, J., & Eerola, T. (2022). Cross-modal transfer of valence or arousal from music to word targets in affective priming? Auditory Perception & Cognition, 5(3-4), 192–210. https://doi.org/10.1080/25742442.2022.2087451
Asendorpf, J. B., Conner, M., De Fruyt, F., De Houwer, J., Denissen, J. J., Fiedler, K., et al. (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27(2), 108–119.
Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454.
Eerola, T., & Lahdelma, I. (2022). Register impacts perceptual consonance through roughness and sharpness. Psychonomic Bulletin and Review, 29, 800–808. https://doi.org/10.3758/s13423-021-02033-5
Hardwicke, T. E., Wallach, J. D., Kidwell, M., Bendixen, T., Crüwell, S., & Ioannidis, J. P. A. (2019). An empirical assessment of transparency and reproducibility-related research practices in the social sciences (2014-2017). https://doi.org/10.31222/osf.io/6uhg5
Ioannidis, J. P. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.
Lahdelma, I., & Eerola, T. (2022). Registered report - valenced priming with acquired affective concepts in music: Automatic reactions to common tonal chords. Music Perception.
Marwick, B. (2016). Computational reproducibility in archaeological research: Basic principles and a case study of their implementation. Journal of Archaeological Method and Theory, 1–27.
OSC. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing detailed research data is associated with increased citation rate. PLoS ONE, 2(3), e308.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
Stodden, V., Leisch, F., & Peng, R. D. (2014). Implementing reproducible research. CRC Press.